ggplot2 is perhaps the best (most principled) plotting library around.
Here are the references/resources.
ggplot(data=incex) + geom_bar(mapping = aes(x = occ))
geom_bar() uses stat_count() by default. We can reproduce the barplot using stat_count().
ggplot(incex) + stat_count(aes(x = occ))
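To check that the two calls compute the same thing, the sketch below compares the per-bar counts produced by each layer. It assumes a small made-up data frame (toy, a stand-in for incex, not the course data) and uses layer_data() from ggplot2 to inspect the computed values.

```r
library(ggplot2)

# Hypothetical toy data standing in for incex (not the course data set).
toy <- data.frame(occ = c("low", "low", "med.", "high", "high", "high"))

p_geom <- ggplot(toy) + geom_bar(aes(x = occ))
p_stat <- ggplot(toy) + stat_count(aes(x = occ))

# layer_data() returns the data a layer computes before it is drawn.
counts_geom <- layer_data(p_geom)$count
counts_stat <- layer_data(p_stat)$count

print(counts_geom)
print(isTRUE(all.equal(counts_geom, counts_stat)))
```

Both layers should report the same counts, since geom_bar() only adds the bar geometry on top of the counting stat.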
For data that are already counted, use the identity stat.
occ_counted <- incex %>% count(occ, name = 'count')
occ_counted
   occ count
1  low   595
2 med.   700
3 high   627
ggplot(occ_counted) + geom_bar(aes(x = occ, y = count), stat = 'identity')
ggplot(occ_counted) + geom_col(aes(x = occ, y = count)) # shortcut
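A minimal sketch of why the identity stat matters for pre-counted data: with a hypothetical three-row table (toy_counted, made up for illustration), geom_col() uses the supplied counts as bar heights, while geom_bar() with its default stat counts rows and so gives every bar a height of 1.

```r
library(ggplot2)

# Hypothetical pre-counted table with the same shape as occ_counted.
toy_counted <- data.frame(occ = c("low", "med.", "high"), count = c(2, 1, 3))

# geom_col() draws the supplied counts as bar heights.
p_col <- ggplot(toy_counted) + geom_col(aes(x = occ, y = count))
heights_col <- as.numeric(layer_data(p_col)$y)

# geom_bar() with the default stat counts rows: one row per level here.
p_bar <- ggplot(toy_counted) + geom_bar(aes(x = occ))
heights_bar <- as.numeric(layer_data(p_bar)$count)

print(heights_col)
print(heights_bar)
```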
Use after_stat() to modify the mapping from computed stats (here we calculate percentages).
ggplot(incex) + geom_bar(aes(x = occ,
y = after_stat(100 * count / sum(count))))
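The percentage calculation can be checked on a small example. The sketch below assumes a made-up data frame (toy, not the course data) with counts of 2, 2 and 4, so the bars should come out at 25, 25 and 50 percent.

```r
library(ggplot2)

# Hypothetical toy data: 2 x "low", 2 x "med.", 4 x "high".
toy <- data.frame(
  occ = c("low", "low", "med.", "med.", "high", "high", "high", "high")
)

p <- ggplot(toy) +
  geom_bar(aes(x = occ, y = after_stat(100 * count / sum(count))))

# The y column of the layer data holds the bar heights after the stat.
pct <- as.numeric(layer_data(p)$y)

print(pct)
print(sum(pct))
```

Note that sum(count) here runs over every bar in the layer, so the percentages add up to 100 across the whole plot rather than within subgroups.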
ggplot(occ_counted) + geom_bar(aes(x="", y = count, fill=occ), stat = "identity")
ggplot(incex) + geom_bar(aes(x = occ, fill=sex))
ggplot(incex) + geom_bar(aes(x = occ, fill=sex)) +
scale_fill_brewer(palette="Dark2")
ggplot(incex) + geom_bar(aes(x = occ, fill=sex)) +
scale_fill_manual(values=c("#999999", "#E69F00"))
position = "fill" makes “proportion bars”.
ggplot(occ_counted) + geom_bar(aes(x="", y = count, fill=occ), stat = "identity", position="fill") + labs(y="proportion")
ggplot(incex) + geom_bar(aes(x = occ, fill=sex), position = "fill") +
labs(x="occupational status", y="proportion", title="stacked plot")
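To see what position = "fill" does numerically, the sketch below builds a stacked plot on made-up data (toy, hypothetical values) and sums the segment heights within each bar. Each bar should total 1, which is why the y axis reads as a proportion.

```r
library(ggplot2)

# Hypothetical toy data: 3 "low" rows and 2 "high" rows.
toy <- data.frame(
  occ = c("low", "low", "low", "high", "high"),
  sex = c("f", "m", "m", "f", "m")
)

p <- ggplot(toy) + geom_bar(aes(x = occ, fill = sex), position = "fill")

# After position adjustment, each segment spans ymin..ymax.
built <- layer_data(p)
built$share <- built$ymax - built$ymin

# Sum the segment heights within each bar (x position).
totals <- tapply(built$share, as.numeric(built$x), sum)

print(built$share)
print(totals)
```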
coord_polar() interprets x and y as radius and angle.
occ_counted <- incex %>% count(occ, name = 'count')
p <- ggplot(occ_counted) + geom_bar(aes(x="", y = count, fill=occ),
stat = 'identity', width=1)
p + coord_polar(theta = "y")
Add a theme. Here, theme_void() removes the background, grid, and numeric labels.
p + coord_polar(theta = "y") +
theme_void()
ggplot(incex) + geom_histogram(aes(x = income), bins = 20) + theme_minimal()
ggplot(incex) + geom_density(aes(x = income)) + theme_minimal()
ggplot(incex) + geom_histogram(aes(x = income, y = after_stat(density)), bins=20) +
geom_density(aes(x = income), col="red") + theme_minimal()
Aesthetics shared by all layers can be set once in ggplot().
ggplot(incex, aes(x = income, y = after_stat(density))) +
geom_histogram(bins=20) + geom_density(col="red") + theme_minimal()
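The density scaling is what lets the histogram and the density curve share one y axis: on the density scale the bar areas sum to 1. A minimal sketch, assuming simulated data (toy, generated with rnorm() and not the course data):

```r
library(ggplot2)

# Hypothetical simulated incomes; the values are made up.
set.seed(1)
toy <- data.frame(income = rnorm(200, mean = 50, sd = 10))

p <- ggplot(toy, aes(x = income, y = after_stat(density))) +
  geom_histogram(bins = 20)

# Each bar has height `density` and spans xmin..xmax.
bars <- layer_data(p)
area <- sum(bars$density * (bars$xmax - bars$xmin))

print(area)
```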
ggplot(incex) + geom_histogram(aes(x = income, fill=sex),
bins=20, alpha=0.7) + theme_bw()
ggplot(incex, aes(x = income)) + geom_boxplot() + theme_bw()
ggplot(incex, aes(x = income)) + geom_boxplot() + theme_bw() +
theme(axis.text.y=element_blank(),
axis.ticks.y=element_blank())
ggplot(incex, aes(x = income, y = sex)) + geom_boxplot() + theme_bw()
ggplot(incex, aes(x = income, y = sex)) + geom_violin() + theme_bw()
ggplot(incex, aes(x = income, y = sex)) + geom_violin() +
geom_point(alpha=0.2) + theme_bw()
ggplot(incex, aes(x = income, y = sex)) + geom_violin() +
geom_jitter(width = 0.1, height = 0.1, alpha=0.2) + theme_bw()
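As a rough check of what width and height mean, the sketch below jitters a few made-up points (toy, hypothetical values) and measures how far each x value moved. It uses position_jitter() with a seed, which is an assumption made here so the offsets are reproducible; geom_jitter() applies the same position adjustment.

```r
library(ggplot2)

# Hypothetical toy data; the income values are made up.
toy <- data.frame(income = c(10, 20, 30, 40), sex = c("f", "m", "f", "m"))

# A seeded position_jitter() makes the random offsets reproducible.
p <- ggplot(toy, aes(x = income, y = sex)) +
  geom_point(position = position_jitter(width = 0.1, height = 0.1, seed = 1))

# Difference between the drawn x positions and the original values.
shift_x <- as.numeric(layer_data(p)$x) - toy$income

print(max(abs(shift_x)))
```

The offsets are drawn in both directions, so the total spread is up to twice the value given for width or height.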
ggplot(incex, aes(sample = income)) + stat_qq() + stat_qq_line() + theme_classic()
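For intuition about what stat_qq() plots, the sketch below recomputes the points by hand on simulated data (toy, made up with rexp()): the sorted sample goes on the y axis and normal quantiles at ppoints(n) on the x axis. This assumes the default settings (normal distribution, no custom quantiles).

```r
library(ggplot2)

# Hypothetical right-skewed sample; the values are simulated.
set.seed(1)
toy <- data.frame(income = rexp(50, rate = 1 / 40))

p <- ggplot(toy, aes(sample = income)) + stat_qq() + stat_qq_line()

# Layer 1 is stat_qq(): x = theoretical quantiles, y = sorted sample.
pts <- layer_data(p, 1)

manual_theoretical <- qnorm(ppoints(nrow(toy)))
manual_sample <- sort(toy$income)

print(isTRUE(all.equal(as.numeric(pts$x), manual_theoretical)))
print(isTRUE(all.equal(as.numeric(pts$y), manual_sample)))
```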
geom_point() with two continuous variables
ggplot(incex, aes(x = age, y = income)) + geom_point() + theme_bw()
ggplot(incex, aes(x = age, y = income)) + geom_jitter() + theme_bw()
alpha controls the degree of transparency for data points.
ggplot(incex, aes(x = age, y = income)) + geom_point(alpha=0.3) + theme_bw()
Map a grouping variable to col or shape inside aes().
ggplot(incex, aes(x = age, y = income, col=sex)) +
geom_point(alpha=0.3) + theme_bw()
ggplot(incex, aes(x = age, y = income, shape=sex)) +
geom_point(alpha=0.3) + theme_bw()
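The placement inside aes() matters: inside, col is mapped from a variable; outside, it is set to one constant value for every point. A small sketch on made-up data (toy, hypothetical values) shows the difference in the colours each layer ends up with.

```r
library(ggplot2)

# Hypothetical toy data; the values are made up.
toy <- data.frame(
  age = c(25, 35, 45, 55),
  income = c(30, 50, 70, 60),
  sex = c("f", "m", "f", "m")
)

# Inside aes(): colour is mapped from the sex variable.
p_mapped <- ggplot(toy, aes(x = age, y = income, col = sex)) + geom_point()

# Outside aes(): colour is a fixed setting for all points.
p_fixed <- ggplot(toy, aes(x = age, y = income)) + geom_point(col = "red")

n_mapped <- length(unique(layer_data(p_mapped)$colour))
cols_fixed <- unique(layer_data(p_fixed)$colour)

print(n_mapped)
print(cols_fixed)
```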
Two facet functions for splitting data by categories:
facet_wrap(): “wraps” a 1d ribbon of panels into 2d.
facet_grid(): produces a 2d grid of panels defined by variables that form the rows and columns.
facet_wrap() with one categorical variable
ggplot(incex, aes(x = age, y = income)) + geom_point() +
facet_wrap(~ sex)
facet_wrap() with two categorical variables: sex and edu
ggplot(incex, aes(x = age, y = income)) + geom_point() +
facet_wrap(sex ~ edu)
facet_wrap() with two categorical variables: occ and edu
ggplot(incex, aes(x = age, y = income)) + geom_point() +
facet_wrap(occ ~ edu)
facet_grid() with variables sex and edu.
ggplot(incex, aes(x = age, y = income)) + geom_point() +
facet_grid(sex ~ edu)
facet_grid() with variables occ and edu.
ggplot(incex, aes(x = age, y = income)) + geom_point() +
facet_grid(occ ~ edu)
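One practical difference between the two facet functions shows up when some category combinations have no data. The sketch below uses made-up data (toy, hypothetical values) in which the combination sex = "m", edu = "high" never occurs, and counts the panels by reading the layout table of the built plot. Treat that table as an internal structure that may change between ggplot2 versions.

```r
library(ggplot2)

# Hypothetical toy data: no row has sex == "m" and edu == "high".
toy <- data.frame(
  age = c(25, 35, 45),
  income = c(30, 50, 70),
  sex = c("f", "f", "m"),
  edu = c("low", "high", "low")
)

base <- ggplot(toy, aes(x = age, y = income)) + geom_point()

# The built plot's layout table has one row per panel.
n_wrap <- nrow(ggplot_build(base + facet_wrap(sex ~ edu))$layout$layout)
n_grid <- nrow(ggplot_build(base + facet_grid(sex ~ edu))$layout$layout)

print(n_wrap)
print(n_grid)
```

In this sketch facet_wrap() should keep only the three combinations present in the data, while facet_grid() lays out the full 2 x 2 grid and leaves one panel empty.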
Customize the theme!
ggplot(incex) + geom_bar(aes(y = occ)) + facet_wrap(~ sex) +
labs(title = "Occupational status counts by gender",
caption = "source: DS3003, Fall 2021",
x = NULL,
y = NULL) +
theme_minimal() +
theme(
strip.text = element_text(face = 'bold', hjust = 0),
plot.caption = element_text(face = 'italic'),
panel.grid.major = element_line(colour = 'white', linewidth = 0.5), # linewidth replaces size in ggplot2 >= 3.4.0
panel.grid.minor = element_blank(),
panel.grid.major.y = element_blank(),
panel.ontop = TRUE
)
ggplot2 is huge! About 50 geoms, 25 stats, 60 scales.
Many extensions are highly specialized and developed by experts in their fields.
100 registered extensions are available to explore (https://exts.ggplot2.tidyverse.org).
For plot composition, for example, you might want to use gridExtra::grid.arrange(), ggpubr::ggarrange(), cowplot::plot_grid(), or patchwork.
library(gridExtra)
p1 <- ggplot(incex) + geom_bar(aes(x = occ, fill=sex)) # barplot
p2 <- ggplot(incex) + geom_histogram(aes(x=income), bins=10) # histogram
p3 <- ggplot(incex) + geom_point(aes(x = age, y=income)) # scatterplot
grid.arrange(p1, p2, p3, nrow=1)
library(patchwork)
p1 + p2 + p3
( p1 + p2 ) / p3
library(GGally)
ggpairs(incex[, c('income', 'oexp', 'age', 'edu')])